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Evaluating  AT-D-A  for  Sparse  Matrices:  Analysis 

Amnon  Gonen 
Naval  Postgraduate  School 

ABSTRACT 

The  evaluation  of  the  matrix  product  ATA  or  AT-D-A  .where  A 
is  an  mxn  real  matrix  and  D  an  mxm  diagonal  matrix,  is  a  funda- 
mental operation  for  many  algorithms.  We  analyze  the  evaluation 
of  AT-A  for  several  configurations  of  sparse  matrices  A  ail  of  which 
have  the  same  sparsity.  The  complexity  of  the  evaluation  is 
estimated,  and  application  to  certain  problems  of  optimization  are 
given. 

Keywords:  Sparce  'Matrix,  Hessian  evaluation,  Optimization 


1.  INTRODUCTION 

Many  fundamental  algorithms  in  numerical  analysis  include  the 
evaluation  of  AT-A  or  AJ-D-A,  where  A  is  a  real  mxn  matrix  and  D  is  a 
diagonal  mxm  matrix.  Examples  are  given  in  papers  en  factorization  of 
matrices  or  problems  of  minimization  in  which  the  Hessian  has  this  form 
(  Gay  [l]  Gonen  &  Avriei  [3]  ).  The  extended  use  of  this  product  motivates 
the  question  of  reducing  its  complexity. 


This  research  "p.s  partially  supported  by  the  NrS  Foundation  Research  Program. 


The  purposes  of  this  paper  are: 

1.  To  relate  the  computational  complexity  of  AT-DA  to  the  sparsity  rate 
of  the  matrix  A  . 

2.  For  a  given  sparsity  rate,  to  distinguish  between  the  worst  and  the 
best  case. 

3.  To  provide  an  application  of  these  results. 

The  problem  of  multiplying  a  transpose  of  a  sparse  matrix  by  itself 
was  discussed  in  several  books  and  papers  e.g.  George  &  Liu  [2]  in  which 
they  include  the  number  of  operations  required  for  this  multiplication. 
Gustavson  [4]  propcsed  an  optimal  algorithm  for  multiplying  two  sparse 
matrices  A-B  where  AeRn*m  and  SeR™*'-,  proving  that  the  number  oi 
multiplication  N  satisfies  0<N<nmJc  .  However,  the  connection  between 
the  number  of  operation  and  the  sparsity  rate  of  the  matrices  was  not 
discussed. 

Apparently,  it  seems  that  this  question  has  only  theoretical  meaning 
since  the  matrix  A  is  provided  and  therefore  the  number  of  operations  is 
known.  However,  in  this  paper  we  will  see  there  exist  some  cases  in  which 
the  configuration  of  this  matrix  A  can  be  designed  by  the  user.  In  these 
cases  it  make  sense  to  analyze  this  product  in  order  to  reduce  the 
number  of  operations. 

In  section  2  of  this  paper,  we  present  the  computational  complexity 
of  ATD-A  for  several  sparsity  patterns  of  .4.  In  this  section,  we  establish 
our  results  on  the  assumption  that  the  number  of  nonzero  elements  of 


-  3- 

Lhe  matrix  A  is  provided.  VvTe  demonsLraLe  the  best  and  the  worst  case, 
showing  that  in  the  best  case,  the  nonzero  elements  are  divided  homo- 
geneously among  the  rows  of  A,  while  in  the  worst  case,  these  nonzero 
elements  are  confined  in  a  limited  number  of  rows. 

In  section  3  we  provide  an  example  from  optimization  theory,  in 
which  the  matrix  A  is  dense  and  by  applying  the  results  of  section  2  w  t 
minimize  the  number  of  multiplication  in  the  evaluation  of  the  Hessian. 

In  this  paper,  all  vector  spaces  are  finite  dimensional  and  vectors  are 
column  vectors.  The  space  of  all  nxm  matrices  is  denoted  by  Rn'Arn  ;  the 
nonnegative  orthant  of  the  Euclidean  space  R71  is  denoted  by  E%  ;  the 
subset  of  all  integer  vectors  in  Rn  is  denoted  by  I71,  and  its  nonnegative 
orthant  by  F;.  For  a  matrix  A  we  denote  by  c_.  and  a^  the  i-th  row  and 
the  ;'-th  column  respectively.  The  transpose  of  A  is  denoted  by  AT .  By 
the  norm  |  |ar|  |  we  mean  the  Euclidean  norm.  For  a  real  number  r  its 
integer  part  is  denoted  by  fcrj-  Finally,  the  number  of  elements  in  the  set 
B  is  denoted  by  \B\  ,  and  the  number  of  zero  elements  in  a  matrix  A  is 
denoted  by  Z(A). 

2.  THE  COMPUTATIONAL  COMPLEXITY  07  AT  D-A. 


N 

Let  A  be  in  Rm*n  with  N  nonzero  elements.  The  ratio  is  called 

mn 

the  sparsit7  rate  of  the  matrix  .4  and  denoted  by  a(.4).   In  this  section  we 

assume  that  the  sparsity  rate  of  the  matrix  A€-RmVn  is  provided  and  that 


-bl- 
each row  of  A  includes,  at  least,  one  nonzero  element.  We  concentrate  on 
the  sparsity  pattern  of  A,  looking  for  the  best  and  the  worst  cases,  by 
means  of  the  number  of  operations  required  to  compute  AT-D-A  where  D 
is  a  diagonal  matrix  D^BmXm.  We  begin  our  exploration  in  the  worst  case, 
in  which  the  configuration  of  A  implies  the  maximum  number  of  multipli- 
cation.  Let  us  denote  by  rru  the  number  of  nonzero  elements  in  a^>  ,  thus 


IX  =  ,v.  (2.1) 

i  =  l 


Our  first  Lemma  provides  us  the  number  of  operations  (multiplica- 
tions) required  to  accomplish  the  product  A7 -A. 

Lemma  2.1:  Let  AzR7^^-  be  a  given  sparse  matrix,  then  the  product  A7 -A 
can  be  computed  using 

J^X-C^  +  l)  (2.2) 

multiplications. 
Proof:  The  product  AT-A  can  be  rewritten  as  a  sum  of  m  rank  1  matrices 

ATA=y&.  a7,  (2.3) 

Trie  rank  1  matrices  <v  o^  are  symmetric.  Each  nonzero  clement  Oj  ,-  of 
the  vector  a,.,  is  multiplied  by  all  ether  nonzero  elements  a,-,  for  k>j  . 
Therefore,  the  number  of  multiplication  is 

£i  =&7v(mi  +  l-)  (2.4) 


combining  (2.3)  with  (2.4)  yields  the  proof  of  the  lemma. 


From  the  proof  above,  it  can  easily  be  seen  that  the  number  of  addi- 
tions are  approximately  the  same  as  the  number  of  multiplications  since 
each  term  ai|jfe-afc);-  is  accumulated  into  the  result  matrix  C\  C  =  AT  A. 

Ccrcilary  2.1:  Let  A^RmYn   be  a  sparse  matrix  and  D^Rmilm   a  diagonal 
matrix  then  the  product  AT-D-A  can  be  computed  by 

£2*v(™i  +  l)+tf  (2.5) 

multiplications . 

Prccf:  We  first  compute  A  =  D-A  which  requires  N  multiplications  and 

then  substituting  of*  by  u7*  in  (2.3)  yields  the  proof  of  the  corollary. 

In  order  to  find  the  sparsity  pattern  which  yields  the  worst  case,  we 
have  to  maximize  (2.2)  provided  (2.1)  and  all  m^  are  positive  integers. 
Since  the  difference  between  (2.2)  and  (2.5)  is  N,  it  is  enough  to  explore 
the  worst  case  for  the  product  AT-A  that  will  yield  the  same  result  for 
AT  D-A.    Consequently,  a  new  problem,  can  be  formulated  as  follow: 

/ai\  fl  mi '(im  4-1) 

(Al)  max  2, (2.6) 

i  =  i  c 

subject  to  the  constraint 


i  =  l 


and 


l^m^^n  ;  rr^e/1 


(2.7) 


This  problem  can  be  reduced  to  maximizing  Y,mf  under  the  same  con- 
straints.   Defining  xi=mi-l  yields  the  following  problem 


(A2) 


max   \x 


(2.8) 


subject  to  the  constraint 


i  =  l 


and 


(2-S) 


0<ztr<n  -1  ,  x.<El 


n 


(2-10) 


We  will  prove  that  since  the  objective  function  is  convex  its  maximum  is 
attained  at  a  boundary  point.  An  integer  vector  are/71  is  called  a  boundary 
point  of  problem  (A2)  if  there  exists  a  set  J  =  \j,,...J.«\cL  -  \l,...,m]  and  a 
unique  j0cL-J  such  that 


Xt  = 


[N  -m)-^(n-l)    i=jQ 

0  otherwise 


J.  11) 


where 


t5  = 


N  -TTL 


n-1 


(2.12) 


In  this  case  Lhe  vector  x  -x  +e  where  e  =(l,...,l)  is  a  boundary  point  of 
problem  (Al). 


-   / 


Fortunately,  from  the  symmetric  property  of  the  objective  function, 
the  optimal  value  does  not  depend  on  the  selected  boundary  point. 
Hence 


Sib 


(2.13) 


i=i 


To  prove  that  (2.11)  is  the  solution  of  problem  (A2)  we  need  the  fol- 
lowing lemma 


Lemma  2.2  :  Consider  the  integer  problem 


(A3) 


max  |  |  x 
xeln 


!     r*.    I     I    2 


(2-14) 


subject  to  the  constraints 


2>*  =  ir 


i=l 


Q<Xi<M 


(2.15) 


(2  1  3^ 


where  K  and  M  are  positive  integers,  M<K.    Problem  (A3)  has  a  solution 
x*  satisfying 


\x*\  \Z  =  &H2  +  (K-$H)2 


(2.17) 


where  tf  is  the  integer  part  of  —denoted  by 


7i- M>K. 


K 


,  if  and  on]}/  if 


(2.  IS) 


Moreover,  if   (2.18)   holds  then  every  solution  of  problem  (A3)  satisfies 
(2.17). 


Proof:  It  is  immediate  that  if  n-M  <K  then  there  is  no  feasible  solution 
to  problem  (A3).  Therefore,  let  us  assume  that  (2.18)  holds  and  prove 
this  lemma  by  induction  on  the  dimension  of  x.  If  n  =  1,  then  from  (2.15) 
we  have  x  -  K <  M  .  If  K  <  M  then  tf  =  0  and  if  K  -  M  then  #  =  1  .  In  both 
cases  (2.1^)  is  satisfied.  Assuming  the  assertion  is  true  for  all  n,  n< 777,-1. 
Let  us  denote 


F(m  ,M,K)  =  max  [  |  |  x  |  | 2:  £  a*  =  K\  0<^  < M ;  -a:  e/: 


i  =  l 


(2.19) 


hence 


F(m,M,K)=   max  [F(m-l,#,ir-a:m)  +  a:£  ra^e/H  (2.20) 

X «*'— .  — -1  — 


Since  by  the  induction  assumption,  (2.17)  holds  for  m—1 


F(m , M , /T)  =  maxr  ftfitf2  +  ( A"-a^  — oS-JCf  ) 2  +  ~!   :  zm  e/ J  ;  1? : 


iT- 


(2.21) 


la!    0  =  /T  — .  1q  ™ 


J/f    -htI 


T/-|-1  n  l^p 


ri-ipn  n  <  0  is  Of     (  0  i^  th*3  renn  aipd^r  of  (i^v^di!!' 


K  by  .¥  )  Consider  the  maximization  problem  (2.21)  in  two  cases: 


1.         0<xm<p. 


In  this  case  i5  =  ?5  and  the  Droblem  is 


max   Um2  +  (K—#M)S-2(K—&M)- 


Substituting  K--6-M  by  p  ,  yields  the  maximization  of 


&M2  +  p2  -  Spa:^  4-  2a:; 


(2.23) 


subject  to  the  constraint  0^xm<p  and  zm€/|  .    The  maximum  of  (2.23)  is 


-9- 
attained  ai  xm  =p  or  xm  =  0,  and 

|  \x\  \z  =  VM2  +  p2  =  VMz  +  (K-'6M)z.  (2.24) 

2.     p<xm<LM  . 

In  this  case  -5  =  i>  -  1  and  the  problem  is 

max  l($-l)M2  +  (p  +  M)2-2(pi-ll)xm+2x2  |    a^e/i  J  (2.25) 

using  the  same  arguments  as  in  the  first  case,  the  maximum  is  attained 
at  xm  -11,  thus 

|  |x|  \2  =  (&-l)M2  +  (p  +  M)2-2pM=VM2  +  p2  =  '&M2  +  (K-'$M)2.      (2.26) 
In  both  cases  (2.17)  holds,  which  complete  our  proof. 


Applying  Lemma  2.2  to  problems  (Al)  and  (A2)  yields  the  following 
conclusion. 

Corollary  2.3:   Every  xe/+  satisfying  (2. 1 1)  is  a  solution  to  problem  (A2). 

Proof:  Suppose  xeHJ  satisfies  (2.11),  which  mean  that  (2.13)  holds.  Sub- 
stituting M=n-l  and  K-N-m.  in  Lemma  2.2  implies  that  (2.1?)  and 
(2.13)  arc  the  same,  and  Lemma  2.2  implies  that  x  is  a  solution  of  prob- 
lem (A2). 


A  solution  to  problem  (Al)  can  be  established  by  setting 


rrii- 


10 


n  ieJ" 

N-rn  -tf(n  -l)+l    i=j0  (2.27) 

1  otherwise 


where  tf  satisfies  (2.12)  and  J"  is  a  set  of  indices  |/|  =  tf  and  ;'c  is  an  index 
not  in  /.  The  computational  complexity  of  the  product  A7 -A  for  the  worst 
case  is  established  by  substituting  (2.27)  into  (2.6) 

A*u*  =  ;z[>^2  +  (-'?  -™  - #("  -1)  +  I)2  +  ™  -*  ~  1  +#].  (2. 28) 

It  can  be  seen  that  in  the  worst  case  some  of  the  m^-th  achieve  the 
upper  bound  n  ,  the  others  are  zero  and  only  one  of  the  ro^-th  is  some- 
where between  0  and  n.  This  mean  that  the  matrix  A  has  as  many  full 
rows  as  possible,  the  rest  of  the  rows  have  one  element,  and  one  row  con- 
tains the  remaining  nonzero  elements  of  N. 

In  the  next  Lemma  a  new  bound  for  the  computational  complexity  is 
presented  which  enable  us  to  relate  the  sparsity  rate  and  the  mathemati- 
cal effort. 

Lemma  2.4  The  computational  complexity  of  the  product  AT-A  can  be 
bounded  by 

^<&(n(JV-m)+2JV).  *    (2.29) 

Proof:  Let  us  denote  by 

<p(k )  =  (k  -7i2  -f  [(N  - m)  -k  •  (n-1)  +  l]2  +  m  -k  -  1  -i-  N).  (2.30) 

The  first  assertion  is  that 


11 


v( 


jv  —m 


71-1 


)^(~ 7^- 

n  —  I 


(2.31) 


If  we  denote  by  t  = 

71-1 


calculation  yields  that 


N  -m 


71-1 


then  0<^<1  .    A  straightforward 


p(-     ?)  =  (N-m.)-(n  4-  l)  +  m  +  JV. 

n    71  -  1  ^  '    V  ; 


(2.32J 


Hence 


p(  ^     'D  -  <p(  N     T"  -  f )  =  (-'V -77i )•  (n  +  l)+m  4-./V  - 


71-1 


n-1 


(2.53) 


vn-l  *         L  n-1-  ^  n  —  1 

=  (Ar-7yi)-(n  +  l)^77i>A^-[-A/~^(n3-l)  +  77i-!-Ar-t7is-f(t(?i-l)i-ljs+<--l]^ 

=        tn2-(^-l)  +  l)2-«t+l  =  <-(l-<')(n-l)2 

since  0  <  f  <  1  the  last  expression  is  nonnegative  which  prove  our  first 
assertion.  The  rest  of  the  proof  is  established  by  the  following: 


A^=  =  J&( 


N-m 


71  -1 


771 


)^M~^=\Sn(N-m)^2N]. 


(2.34) 


As  we  can  see,  (2.29)  provides  us  an  elegant  bound  for  the  computa- 
tional complexity  of  the  worst  case.    This  bound  is  a  good  approximation 

to  the  computational  complexity  when  — ^-—  is  close  to  its  integer  part. 

The  difference  between  the  mathematical  effort  of  computing  AT-A  in  the 
worst  case  and  this  bound  is  actually  provided  in  the  right  hand  side  of 
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(2.33)  and  it  is 

fc{-(WM«-l)8  '  (2-35) 

whers  {  is  the  fraction  part  of  — — — . 

71—1 

The  bound  in  (2.29)  can  be  expressed  as  a  function  of  the  sparsity 
rate  by  using  the  definition 

c(A)  =  — —  (2.33) 

77171 

which  leads  to  the  following  equality 

/Am  ^  Vi O  (N  ~  m )  +  2^/ ]  =  faun  {o(A )(n  +  2)  - 1) .  (2. 37) 

It  is  interesting  to  observe  the  connection  between  the  bound  in  (2.29) 
and  the  mathematical  effort  to  accomplish  AT-A  without  using  sparsity 
method  which  is 

faf(n  +  l)in.  (2.38) 

The  difference  between  (2.38)  and  (2.29)  can  be  established  by  expanding 
these  two  formulas  achieving 

Jgre(n  +  l)m  -;i[(N  -m)n  -f  2.V]  =  U(n  +  2)(m-7i  -N).  (2.39) 

Dividing  and  multiplying  the  right  hand  side  of  (2.39)  by  77in  yield  the  fol- 
lowing expression  for  the  difference 

%n+2)-7nn(l-a(A))  (2.40) 

where  a(A)  is  the  sparsity  rate  of  A. 
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2.2    The  best  case 


In  our  discussion,  we  call  the  case  in  which  we  need  the  minimum 
number  of  multiplication  to  produce  AT-A  provided  that  there  are  N 
nonzero  elements  in  A  the  best  case  .  The  number  of  operations  in  the 
best  case  can  be  derived  by  minimizing 


m  77y(?^4-l) 
■^  2 

i  =  l  c 


(2.41) 


subject  to  (2.5)  and  (2.7).   Tuthout  the  integer  restriction  it  is  immediate 
that  since  the  objective  function  is  convex,  the  solution  will  be  the  arith- 

N 

me  tic  mean,  that  is,  for  all  i  ,  iru  =  — — .  The  restriction  that  all  the  rru  have 

771 

to  be  integral  yields  the  solution 


1 

N 

m 

771*=' 

771 

i'Zl.-J 


+  1 


(2.42) 


where     L  =  $l,...,mj    ,     JzL     and     \L-J\-  N  - 


number  of  multiplication  in  the  best  case  is 


N_ 
m 


77i.     Consequently,    the 


m  7ni*(77xt*+  1)      ,   , 

P-bc  ~  Li  o  ~  /Z\ 

i  =  l 


772. 


AT 


1)-{j£N  -mj  — \) 


M' 


(2.413) 


In  order  to  present  the  magnitude  of  the  difference  between  the  worst 

and  the  best  case,  let  us  assume  that  —  and  "    ~rr^  are  integers,  in  this 

m  n  —  1 


case  (2.29)  holds  with  equality  and 


to*=%^-(N  +  m).  (2-44; 


Subtracting  jj.bG  from  /^  yield 


JV2 


!Av=  -Vbc  =%(nN  -ran  +  JV  -  — )  =  (2.45) 

=  ^0V-77l)(l-(7(/l)). 

If  we  take,  for  example,  N=}£rn(n  +  1)  the  difference  will  be    — *— 


R 


while  /ztc  =  : — — — -jZi— — —    mat  means  that  ,a~.c,  for  large  n,  is  approxi 
mately  50%  more  than  ,uic . 


3.  APPLICATION 

In  this  section  we  present  an  example  in  which  the  product  AT  -D-A  is 
required  where  D  is  a  diagonal  matrix  and  the  pattern  of  A  can  be 
designed  in  order  to  reduce  the  computational  effort.  Since  we  are  dis- 
cussing the  number  of  zeroes  in  matrices,  let  us  denote  by  Z(A)  the 
number  of  zero  elements  in  the  matrix  A.  Consider  the  problem  intro- 
duced by  Gay  [l] 

m. 

(PI)  min  ?(x)  =  ZPiMx))  (3.1) 

i  =  l 

where  ri:Rn->R  ,pt-.R^R  and  m^n.  Very  often  r{x)  -  (r1(flr),...Jr!7l(a;))  is  a 
linear  function  of  x  ,  (see  for  example  Gonen  &  Avriel  [3],  or  the  least 
square  problem  in  Gay  [1])  which  mean 

r(x)  =A-x  -b.  (3.2) 
In  this  case,  the  gradient  and  Hessian  of  50  have  particularly  simple  forms 

V<p(x)  =  AT-p'(r(x))  (3.3) 

^<?{x)  =  ATDA  (3.4) 
where 

P'(r(x))  =  \p'1{T1(x))....tp'n(rn(x))]  (3.5) 
and 

D  =  diag[p"l(r1(x)),...,p"m(rm(x))]  (3.6) 

is  the  diagonal  matrix  with  diagonal  elements  p'^faix))  .  Since  we  have  a 
simple  analytic  presentation  of  the  gradient  and  Hessian  ,  it  is  reasonable 
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to  consider  using  Newton  method  Lo  construcL  a  sequence  of  iterates 
which,  under  reasonable  conditions,  converge  to  a  local  rninimizer.  This 
mean  that  the  product  AT-D-A  will  be  used  each  iteration  and  very  often 
this  computation  is  the  most  expensive  part  of  the  algorithm.  The  main 
idea  is  to  accomplish  an  initial  preparation  step  by  factoring 

A  =  BQ  (3.7) 

where  Q<zRr'*n  is  nonsingular  and  BzR^-*71  has  {nz  -  n)  zeroes  in  it 
{Z{B)  =n).  The  next  step  is  to  substitute  Qz  by  y  in  (3.7)  leading  to  the 
problem 

(P2)  min  <p(x)  =  Y.pdnix))  (3.8) 

i=i 

where 

r(y)  =  Ey-b.  (3.8) 

To  establish  the  connection  between  the  two  problems,  let  us  introduce 
the  following  Lemma: 

Lemma  3.1:  A  point  x*  satisfies  sufficient  conditions  for  minimum  of 
problem  PI  with  r{x)  defined  by  (3.2)  if  and  only  if  y*  =  Qx  satisfies 
sufficient  conditions  for  minimum  of  problem  P2. 

Prcci:  The  sufficient  conditions  for  minimum  of  problem  Pi,  where  r(x) 
satisfies  (3.2),  are: 

AT-V<p(Ax>)  =  0  (3.10) 
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zT-AT-Vz(p(Ax')Az  >  0  (3.11; 

for  all  2^0.   Since  A  =  BQ    where    Q  is  a  nonsingular  matrix  (3.10)   is 
equivalent  to 

BTVcp(By*)  =  0  (3.12) 

and  (3.11)  can  be  rewritten  as 

zT-QT-BT-y2?(£y*)B-Q-z  >0  (3.13) 

for  all  z?*0.  Since  Qz  =  0  if  and  only  if  z  -  0  our  proof  is  completed. 


It  is  important  to  mention  that  from  Lemma  3.1  we  can  deduce  that 
if  A  is  a  nonsingular  square  matrix  then  it  is  enough  to  minimize  p(y)  and 
the  minimizer  x*  will  satisfy  x*  -  A~ly* . 

In  our  next  lemma  we  introduce  a  set  of  matrices  A<^Rm*r~  such  that 
for  every  factorization  of  a  matrix  in  this  set:  A  =  BQ  where  0  is  a  non- 
singular  matrix,  the  matrix  B  will  have  at  most  n2-n  zerces 
(Z(B)  <  n2  -  n ).  Next  we  show  a  practical  method  of  factorizing  a  full 
ranked  matrix  which  achieve  at  least  rJ2  -n  zeroes:  in  general  we  cannot 
expect  more  . 

Lemma  3.2:  Let  AeR™-*71  where  m>n  be  a  full  rank  matrix.  Let  3"  =  [A, -I] 
be  an  m  by  71+771  matrix.  If  any  set  of  m,  columns  of  7L  are  linearly 
independent  then  fcr  every  factorization  A  =  BQ  where  QeRnxn  is  a  non- 
singular   matrix   and   BzRmxn,   the    matrix   B    will   include    ,    at   least, 


lo 


n(rn  +  l)-n2  nonzero  elements,    (that  is,  Z(B)  <  nz-n  ). 

Prcoi:   Consider  the  factorization  AQ~l  =  B  which  can  be  written  as  n 

identical  linear  systems 

A-{Q-l)*j  -I-B*j  =  0       jf  =  l....,n  (3.14) 

The  coefficients  matrix  %=[A,-I]  has  rank  m  and  any  mxm  submatrbc  of 


74.  has  full  rank.  Let  us  denote  by  x  the  vector 


0zl 


<     J  > 


in  R™+n.  First  we 


claim  that  a:  has  at  least  77l-:-i  nonzero  elements.  Suppose  x  has  less  then 
77i+l  nonzero  elements  then  it  has  at  least  n  zero  elements.  Suppose 
xt=x,=  ■  ■  ■   =Xi  =0   and    define    Cei?mxm    to   be    a   submatrix   of   %  with 

-1  -2  TV 

columns  .%  where  ;Vi;;  for  all  l<fc^7i.  According  to  the  lemma's  assump- 
tion, C  is  nonsingular  and  therefore  the  only  solution  to  Cy  -  0  is  y=Q 
which  mean  Q^1  is  zero.  This  contradicts  our  assumption  that  0  is  non- 
singular.  Therefore  the  matrices  Q  and  B  together  have  at  least  nm+n 
nonzero  elements.  If  we  assume  that  all  the  zeroes  arc  in  B,  we  still 
remain  with  n(m-rl)-nz  nonzero  elements  in  B. 


Comment:  Any  Vandermonde  matrix  satisfies  the  conditions  of 
lemma  3.2  therefore  there  arc  infinitely  many  examples  of  matrices  For 
which  cne  cannot  expect  to  get  more  than  nz  -  n  zeroes  in  B . 

Next  we  introduce  a  practical  method  to  factorize  a  full  ranked 
matrix  A  with,  at  least,  nz-n  zero  elements  in  3 . 
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The  factorization 


Let  AeRmxn  be  a  full  rank  matrix  where  m>n  .  Then  we  can  write 


A  = 


(3.15) 


Suppose  Ai  is  nonsingular  nxn  matrix.   In  this  case  we  can  take 


B  = 


&    ■  A  -1 
Jig  A\ 


Q  =  [Al] 


(3.16) 


and  there  are  n2-n  zeroes  in  B .  However,  this  factorization  is  the  worst 
case  cf  section  1.  In  order  to  accomplish  a  better  factorization,  let  as 
assume  that  m  >2n  in  this  case  we  can  write  the  matrix  as  follow: 


A       M 


(3.17) 


where  Ai^RnXn  is  a  nonsingular  matrix,  A3^Rniin  and  A2zR!>m~'l7l'i*r'.  Assume 
that  A$A{1  can  be  factorized  into  L-  U  where  L  and  U  are  lower  and  upper 
triangular  matrices  respectively. 


B  = 


Q  -  U-Ax 


(3.  IS) 


will  give  us  a  factorization  with  n2-n  zeroes  in  B  and  its  form  will  be 
closer  to  uniform  distribution  of  the  zero  elements  among  the  rows  of  the 
matrix. 

It  is  interesting  to  observe  cases  in  which  the  matrix  A  is  not  of  full 
rank.  "We  will  shew  that  in  some  cases  it  is  possible  to  achieve  more  zeroes 
than  the  full  rank  case  and  in  other  cases,  the  opposite  is  true. 
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Lemma  3.3:  Let  A^RTnyn  where  ran/c(A)  <  n.  A  sufficient  condition  lor  A  to 
have  a  factorization  A  =  BQ  ,  where  QERnXn  is  a  nonsingular  matrix  and 
B£Rm'An  such  that  Z(5)  >  n-  -  n  is  that 

ranlc(A)  +  n  <  m  +  1  (3.19) 

Proof:  Suppose  that  rank(A)  =  k  ,l<Kn,  Without  loss  of  generality  we 
may  assume  that  the  first  k  columns  of  A  are  linearly  independent  and 
the  last  {n-k)  columns  are  linear  combinations  of  the  first  k  columns.  Let 
us  write  A  =  [Ai,Az]  where  Ai^Rmitk  and  Az^Rm^n~k^-  There  exists  a  matrix 
£-e£*x(7i-fc)  such  that  A2  =  Ai'E  .  The  matrix  /I]  can  be  factorized  to 
Ax  =  B-L-Qx  according  to  (3.16)  where  B^zE771*-  has  kz—k  zero  elements,  and 
Qi€.Rk*k  a  nonsingular  matrix.  Let  BzRm*m  be  the  matrix  with  Bi  in  its 
first  K  columns  and  zeroes  in  its  last  (n-k)  columns  and  let 

Q  =  \^   Qf\  (3-20) 

Since  0Y  is  nonsingular,  Q  is  nonsingular  and  .4  =  BO.  In  this  case  B  has 
at  least  k2  -  k  +  m(n  -  k)  zeroes.  Recall  that  the  number  of  zeroes  in  B  in 
the  full  rank  case  is  n2  -  n  ,  it  follows  that  k2  -  k  +  m(n  -  k)  >  n2  -  n  iff 
kz  -  k(m  +  1)  +■  n(m  +  1)  -  n2  >  0  iff  k2  -  n2  >  (m  +  i)(k  -  n)  .  Since  k  <  n 
the  last  inequality  will  hold  iff  k  +  n  <m  +  1.  This  inequality  is  the 
sufficient  condition  in  (3.19). 

Conclusions 
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We  have  seen  in  this  paper  a  class  of  optimization  problems  for  which 
the  Hessian  matrix  can  be  written  as  A7  -D-A  where  A<ERm*n  and  D^R171*™-  a 
diagonal  matrix.  We  showed  that  in  several  cases,  the  matrix  .4  can  be 
partially  designed  by  the  user  in  order  to  reduce  the  number  of  nonzero 
elements  to  a  minimum.  In  previous  sections  we  explored  the  pattern  of  a 
sparse  matrix  with  a  given  number  of  nonzero  elements.  We  showed  that 
in  order  to  minimize  the  computational  complexity  of  AT-D-A  we  should 
divide  the  nonzero  elements  uniformly  among  the  rows  of  A  and  if  the 
nonzero  elements  are  confined  in  certain  rows  then  the  computational 
complexity  is  maximized. 

The  difference  between  the  evaluation  of  the  product  AT-A  by  method 
of  dense  matrices  and  the  upper  bound  for  the  worst  case  using  sparse 
method  is  presented  in  (2.40).  It  can  be  seen  that  this  difference  depends 

77171  —  N 


linearly  on  the  proportion  of  zero  elements  in  the  matrix  which  is 


772,7! 


.  Furthermore,  the  saving  in  using  sparse  method  is,  at  least,  ^(71+2)77171 
times  this  proportion.  Since  }£(n+ 2)77171  and  (2.38)  are  both  close  for  large 
77i  and  n,  the  saving  is  at  least  the  number  of  operations  for  the  dense 
case  times  the  proportion  of  the  zeroes  elements. 

Finally  we  demonstrated  a  practical  method  for  factorizing  a  full 
ranked  matrix  Aei?mxn  into  B-Q  where  B  has  at  least  712  -  n  zero  ele- 
ments. Furthermore,  we  presented  a  class  of  matrices  A  for  which  you 
cannot  expect  to  get  more  than  712  -  71  zero  elements. 
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Unfortunately  ,  this  factorization  is  not  optimal  since  the  nonzero 
elements  are  not  distributed  uniformly  among  the  rows  and  this  question 
is  still  without  an  answer.  Secondly,  we  proved  that  we  can  achieve  at 
least  n2  -n  zero  elements  in  B  if  A  is  full  ranked  or  ranJc(A)  +  n  <  m  +  1  . 
\\Te  did  not  prove  anything  for  matrices  which  are  not  full  rank  and  do  not 
satisfy  (3.19)  .  The  author  conjecture  is  that  the  theorem  may  apply  also 
for  this  case. 
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